Multi-Dimensional Database Allocation for Parallel Data Warehouses
Abstract
Data allocation is a key performance factor for parallel database systems (PDBS). This holds especially for data warehousing environments where huge amounts of data and complex analytical queries have to be dealt with. While there are several studies on data allocation for relational PDBS, the specific requirements of data warehouses have not yet been sufficiently addressed. In this study, we consider the allocation of relational data warehouses based on a star schema and utilizing bitmap index structures. We investigate how a multi-dimensional hierarchical data fragmentation of the fact table supports queries referencing different subsets of the schema dimensions. Our analysis is based on realistic parameters derived from a decision support benchmark. The performance implications of different allocation choices are evaluated by means of a detailed simulation model.
Similar resources
WARLOCK: A Data Allocation Tool for Parallel Warehouses
We present the WARLOCK tool to automatically determine a parallel data warehouse’s allocation to disk. This GUI-equipped tool is implemented in Java and utilizes an internal cost model and heuristics to determine a disk allocation minimizing both I/O work and query response times. WARLOCK recommends a ranked list of fragmentation candidates, a detailed query performance analysis and a tailored p...
A Parallel Scalable Infrastructure for OLAP and Data Mining
Decision support systems are important in leveraging information present in data warehouses in businesses like banking, insurance, retail and health-care among many others. The multi-dimensional aspects of a business can be naturally expressed using a multi-dimensional data model. Data analysis and data mining on these warehouses pose new challenges for traditional database systems. OLAP and da...
Online Data Mining
Currently, most data warehouses are being used for summarization-based, multi-dimensional, online analytical processing (OLAP). However, given the recent developments in data warehouse and online analytical processing technology, together with the rapid progress in data mining research, industry analysts anticipate that organizations will soon be using their data warehouses for soph...
Fuzzy multi-criteria selection procedures in choosing data source
Technology assessment and selection have a substantial impact on an organization's procedures with regard to technology transfer. Technological decisions are usually made by a group of experts, and integrating their viewpoints into a single decision can be quite complex. Today, operational databases and data warehouses exist to manage and organize data with specific features and henceforth, th...
Caching for Multi-dimensional Data Mining Queries
Multi-dimensional data analysis and online analytical processing are standard querying techniques applied on today’s data warehouses. Data mining algorithms, on the other hand, are still mostly run in stand-alone, batch mode on flat files extracted from relational databases. In this paper we propose a general querying model combining the power of relational databases, SQL, multidimensional quer...